Some embedding tests were failing because we were trying to store multi-dimensional embedding arrays in the RawData using a DataFrame. The fix changes the raw data storage from DataFrame to dictionary to properly store the embeddings.
This pull request introduces enhancements to the StabilityAnalysis modules and optimizes data handling in the utils.py file. The key changes include:
Return Value Enhancement: The return value of the perturb_data function in StabilityAnalysisRandomNoise.py, StabilityAnalysisSynonyms.py, and StabilityAnalysisTranslation.py has changed. Previously the function returned a two-element tuple of result and RawData. It now unpacks result and returns its elements followed by RawData, so callers receive a single flat tuple and no longer need to unpack a nested one.
Data Handling Optimization: In the utils.py file, the creation of a raw data DataFrame using pandas has been replaced with a dictionary. This change reduces the dependency on pandas and potentially improves performance by avoiding unnecessary DataFrame operations when only a simple data structure is needed.
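A minimal sketch of dictionary-based raw data storage, assuming the structure holds original, perturbed, and similarity entries per sample. The helper name build_raw_data and the nested-list embeddings are hypothetical stand-ins for illustration, not the actual code in utils.py.

```python
from typing import Any, Dict, List


def build_raw_data(
    original: List[Any], perturbed: List[Any], similarity: List[float]
) -> Dict[str, List[Any]]:
    """Hypothetical helper: collect raw data in a plain dict.

    Values are stored as-is, so each entry may itself be a
    multi-dimensional embedding (here a nested list).
    """
    if not (len(original) == len(perturbed) == len(similarity)):
        raise ValueError("all fields must have one entry per sample")
    return {
        "original": list(original),
        "perturbed": list(perturbed),
        "similarity": list(similarity),
    }


# One sample whose embedding is a 2x3 nested list, standing in for a
# multi-dimensional embedding array.
original = [[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]]
perturbed = [[[0.1, 0.2, 0.3], [0.4, 0.5, 0.7]]]
raw_data = build_raw_data(original, perturbed, [0.99])
```

Because the dictionary does not impose a tabular shape, each value keeps whatever dimensionality the embedding has.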
These changes aim to enhance the performance and maintainability of the codebase by optimizing data handling and improving the return values of key functions.
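The return-value change can be sketched as follows. This is an illustration under assumptions: _perturb, the string-reversal "perturbation", and the two-element shape of result are all hypothetical and do not reflect the real StabilityAnalysis modules.

```python
from typing import Any, Dict, List, Tuple


def _perturb(texts: List[str]) -> Tuple[Tuple[List[str], List[str]], Dict[str, Any]]:
    """Hypothetical stand-in for the real perturbation logic."""
    perturbed = [t[::-1] for t in texts]  # toy perturbation: reverse each string
    result = (texts, perturbed)
    raw_data = {"original": texts, "perturbed": perturbed}
    return result, raw_data


def perturb_data_before(texts: List[str]):
    # Old shape: a nested tuple, ((original, perturbed), raw_data).
    result, raw_data = _perturb(texts)
    return result, raw_data


def perturb_data_after(texts: List[str]):
    # New shape: result is unpacked, giving (original, perturbed, raw_data).
    result, raw_data = _perturb(texts)
    return (*result, raw_data)


# Callers of the new shape unpack everything in one step.
original, perturbed, raw_data = perturb_data_after(["abc"])
```

Any caller still written for the old shape, such as `result, raw_data = perturb_data(...)`, would fail to unpack the new return value and needs updating.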
Test Suggestions
Test the perturb_data function in each StabilityAnalysis module to ensure the unpacked return values are correctly handled.
Verify that the dictionary-based raw data structure in utils.py correctly stores and retrieves original, perturbed, and similarity data.
Check for any downstream effects or dependencies that might be affected by the change from a DataFrame to a dictionary in utils.py.
Ensure that the removal of the pandas import does not affect any other parts of the codebase.
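One way to approach the last check is to scan a module's source for uses of pandas names that no import binds any more. The sketch below uses only the standard ast module; the function name unbound_pandas_uses and the alias list are assumptions made for illustration, and the check is a rough heuristic rather than a full scope analysis.

```python
import ast
from typing import List, Sequence


def unbound_pandas_uses(
    source: str, aliases: Sequence[str] = ("pd", "pandas")
) -> List[int]:
    """Return line numbers where a pandas alias is used but never imported."""
    tree = ast.parse(source)

    # Collect every name introduced by an import statement.
    bound = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                bound.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                bound.add(alias.asname or alias.name)

    # Report uses of the aliases that no import binds.
    return sorted(
        {
            node.lineno
            for node in ast.walk(tree)
            if isinstance(node, ast.Name)
            and node.id in aliases
            and node.id not in bound
        }
    )


stale = unbound_pandas_uses("import numpy as np\nframe = pd.DataFrame()\n")
clean = unbound_pandas_uses("import pandas as pd\nframe = pd.DataFrame()\n")
```

Running the project's existing test suite remains the more reliable check; this only flags leftover references quickly.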
Labels: bug (Something isn't working), internal (Not to be externalized in the release notes)
2 participants
Internal Notes for Reviewers
Some embedding tests were failing because we were trying to store multi-dimensional embedding arrays in the RawData using a DataFrame. The fix changes the raw data storage from DataFrame to dictionary to properly store the embeddings.
External Release Notes